A Relational Unsupervised Approach to Author Identification
Abstract
In recent decades, speaking and writing habits have changed. Many works have tackled the author identification task using frequentist approaches, numeric techniques, or writing style analysis. Following the last approach, we propose a technique for author identification based on First-Order Logic. Specifically, we translate the complex data represented by natural language text into complex (relational) patterns that represent the writing style of an author. We then model an author as the result of clustering the relational descriptions associated with the sentences. The underlying idea is that such a model can express the typical way in which an author composes the sentences in his writings. So, if we can map such writing habits from the unknown-author model to the known-author model, we can conclude that the author is the same. Preliminary results are promising, and the approach seems practicable in real contexts since it does not need a training phase and performs well also...
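The pipeline described in the abstract (describe each sentence relationally, cluster the descriptions into an author model, then try to map the unknown-author model onto a known-author model) can be illustrated with a very rough sketch. Everything below is an assumption made for illustration and is not the authors' method: sentences are reduced to sets of ground facts such as `("subj", "noun", "verb")`, plain Jaccard similarity stands in for a First-Order Logic similarity measure, a greedy single-pass grouping stands in for the clustering step, and the names `jaccard`, `cluster`, and `model_overlap` as well as the `threshold` value are hypothetical.

```python
from typing import FrozenSet, List, Tuple

# Assumption: a sentence is a set of ground facts, e.g. ("subj", "noun", "verb").
# Real relational descriptions would carry variables and constants.
Fact = Tuple[str, ...]
Sentence = FrozenSet[Fact]
Cluster = List[Sentence]


def jaccard(a: Sentence, b: Sentence) -> float:
    """Jaccard similarity between two fact sets (stand-in for a relational measure)."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def cluster(sentences: List[Sentence], threshold: float = 0.5) -> List[Cluster]:
    """Greedy grouping: a sentence joins the first cluster whose first member
    is similar enough, otherwise it starts a new cluster."""
    clusters: List[Cluster] = []
    for sentence in sentences:
        for group in clusters:
            if jaccard(sentence, group[0]) >= threshold:
                group.append(sentence)
                break
        else:
            clusters.append([sentence])
    return clusters


def model_overlap(unknown: List[Cluster], known: List[Cluster],
                  threshold: float = 0.5) -> float:
    """Fraction of unknown-author clusters that map onto some known-author cluster."""
    if not unknown:
        return 0.0
    matched = sum(
        1
        for cu in unknown
        if any(jaccard(cu[0], ck[0]) >= threshold for ck in known)
    )
    return matched / len(unknown)


# Hypothetical toy data: three sentences by a known author, one by an unknown author.
s1 = frozenset({("subj", "noun", "verb"), ("obj", "verb", "noun")})
s2 = frozenset({("subj", "noun", "verb"), ("obj", "verb", "noun"), ("mod", "adj", "noun")})
s3 = frozenset({("subj", "pron", "verb")})

known_model = cluster([s1, s2, s3])
unknown_model = cluster([s2])
score = model_overlap(unknown_model, known_model)
```

A high `score` would suggest, under these simplifying assumptions, that the writing habits in the unknown-author model can be mapped onto the known-author model. No training phase is involved, which matches the unsupervised setting the abstract describes.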
Similar resources
Unsupervised Author Identification and Characterization
Author identification is a hot topic, especially in the Internet age. Following our previous work in which we proposed a novel approach to this problem, based on relational representations that take into account the structure of sentences, here we present a tool that computes and visualizes a numerical and graphical characterization of the authors/texts based on several linguistic features. Thi...
A Latent Dirichlet Model for Unsupervised Entity Resolution
Entity resolution has received considerable attention in recent years. Given many references to underlying entities, the goal is to predict which references correspond to the same entity. We show how to extend the Latent Dirichlet Allocation model for this task and propose a probabilistic model for collective entity resolution for relational domains where references are connected to each other....
Identification of Power Stripping Resources with Fuzzy Cluster Dynamic Approach (Case Study: West Azerbaijan Province)
Reducing electric power theft is a significant part of the potential benefits of implementing the smart grid concept. This paper proposes a data-based approach to identify locations with unusual electricity consumption. The new distance-based method classifies new data as violating customers if they lie far from the primary consumption data. The proposed algorithm determines the ...
Classification and Identification of Geological Facies Using Seismic Data and Competitive Neural Networks
Geological facies interpretation is essential for reservoir studies. Classifying and identifying seismic traces is a powerful approach to geological facies classification and distinction. The use of neural networks as classifiers is increasing in different sciences, such as seismology. They are computationally efficient and ideal for pattern identification. They can simply learn new algori...
Application of Multi-Temporal Remote Sensing in Determining Cultivated Area
Precision farming aims to optimize field-level management by providing information on production rate, crop needs, nutrients, pest/disease control, environmental contamination, timing of field practices, soil organic matter and irrigation. Remote sensing and GIS have made a huge impact on the agricultural industry by monitoring and managing agricultural lands. Using vegetation indices have been wide...
Publication date: 2013